Mapping Greater Sage-Grouse (Centrocercus urophasianus) occurrence and habitat on BLM land in Wyoming¶
The Greater sage-grouse (Centrocercus urophasianus) is a declining species of conservation concern endemic to the sagebrush steppe ecosystem of western North America (Prochazka et al., 2025). The Bureau of Land Management (BLM) manages the largest share of Greater sage-grouse habitat in the United States, a total of 65 million acres (BLM, 2024). I investigated Greater sage-grouse occurrence patterns across BLM land using Global Biodiversity Information Facility (GBIF) data to assess whether high occurrence was associated with high-priority BLM habitat.
Credit¶
Original GBIF code workflow credit¶
The basis for this code comes from an Earth Lab / ESIIL Environmental Data Science assignment, linked here.
Use of AI disclaimer¶
I used Claude Sonnet 4.5 to troubleshoot parts of this workflow. It was most helpful for cleaning up legend scale bars and adding Natural Earth features to my maps.
Code Workflow Table of Contents¶
Step 0: Define directories and load libraries¶
Step 1: Download GBIF occurrence data¶
- 1.1: Link GBIF to your Jupyter notebook
- 1.2: Check species name before downloading
- 1.3: Download occurrence data
- 1.4: Convert to geodataframe
Step 2: Load in BLM Habitat Management Areas (HMAs)¶
- 2.1: Load data and calculate HMA shape area (m²)
- 2.2: BLM region-specific habitat types
- 2.3: Plot by region name
- 2.4: Plot by priority
Step 3: Spatially join BLM HMAs and GBIF observations¶
Step 4: Normalize observation data¶
- 4.1: Density of observations / area of HMA
- 4.2: Remove rare observations
- 4.3: Mean number of occurrences per HMA
- 4.4: Mean number of occurrences per year
- 4.5: Normalize by areal density, mean occurrences per year, and mean occurrences per HMA
- 4.6: Plot occurrences per priority area
Step 5: Map GRSG occurrence across HMAs¶
- 5.1: Join HMAs (geodataframe) to normalized occurrence data
- 5.2: Plot normalized occurrences by HMA and year
- 5.3: Compare with raw occurrence data (GBIF)
- 5.4: Log transform raw occurrence data to smooth hotspots
Step 0.1: Install dependencies and load libraries¶
Need to install before proceeding:
pip install pygbif
import pathlib
import os
import cartopy.crs as ccrs
import cartopy.feature as cfeature # static plot
import colorsys
import hvplot.pandas
import geopandas as gpd
from getpass import getpass
from glob import glob
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors # static plot
from matplotlib.patches import Patch # good legends
import numpy as np
import pandas as pd
import pygbif.occurrences as occ
import pygbif.species as species
import time
import zipfile
### Overall data directory
grsg_dir = os.path.join(
### home directory
pathlib.Path.home(),
### path components down to the blm-grsg repository's data directory
'Documents',
'Earth Data Cert',
'Earth-Analytics-AY25',
'GitRepos',
'blm-grsg',
'data')
# Make the directory
os.makedirs(grsg_dir, exist_ok = True)
### GBIF Data Directory
gbif_dir = os.path.join(grsg_dir, 'centrocercusUrophasianus_gbif')
# Make the directory
os.makedirs(gbif_dir, exist_ok = True)
### BLM Habitat Management Areas (HMAs) Directory
blm_dir = os.path.join(grsg_dir, 'blm_hmas')
# Make dir
os.makedirs(blm_dir, exist_ok = True)
### Sagebrush Conservation Design (SCD) Directory
scd_dir = os.path.join(grsg_dir, 'scd')
# Make dir
os.makedirs(scd_dir, exist_ok = True)
Step 1: Download GBIF Occurrence Data¶
Step 1.1: link GBIF to your Jupyter notebook¶
- Change to `reset = True` the first time you run it. When you run it, you'll be prompted to enter your GBIF username, password, and associated email address
- Then change it back to `reset = False` so you can re-run the chunk without having to reconnect to GBIF
- You don't need to modify this code chunk in any other way
####--------------------------####
#### DO NOT MODIFY THIS CODE! ####
####--------------------------####
# This code ASKS for your credentials
# and saves it for the rest of the session.
# NEVER put your credentials into your code!!!!
# GBIF needs a username, password, and email
# All 3 need to match the account
reset = False
# Request and store username
if (not ('GBIF_USER' in os.environ)) or reset:
os.environ['GBIF_USER'] = input('GBIF username:')
# Securely request and store password
if (not ('GBIF_PWD' in os.environ)) or reset:
os.environ['GBIF_PWD'] = getpass('GBIF password:')
# Request and store account email address
if (not ('GBIF_EMAIL' in os.environ)) or reset:
os.environ['GBIF_EMAIL'] = input('GBIF email:')
Step 1.2: Check species name before downloading¶
### grab the species info
backbone = species.name_backbone(name = 'Centrocercus urophasianus')
### check it out
print(backbone)
### pull out the species key
species_key = backbone['usageKey']
### check it out
print("Usage key:", species_key)
{'usageKey': 5959240, 'scientificName': 'Centrocercus urophasianus (Bonaparte, 1827)', 'canonicalName': 'Centrocercus urophasianus', 'rank': 'SPECIES', 'status': 'ACCEPTED', 'confidence': 99, 'matchType': 'EXACT', 'kingdom': 'Animalia', 'phylum': 'Chordata', 'order': 'Galliformes', 'family': 'Phasianidae', 'genus': 'Centrocercus', 'species': 'Centrocercus urophasianus', 'kingdomKey': 1, 'phylumKey': 44, 'classKey': 212, 'orderKey': 723, 'familyKey': 9331, 'genusKey': 5959239, 'speciesKey': 5959240, 'class': 'Aves'}
Usage key: 5959240
Step 1.3: Download occurrence data¶
# Only download once
### set file name for download
gbif_pattern = os.path.join(gbif_dir, '*.csv')
### double check that there isn't already a file that matches this pattern.
### if it already exists, skip the whole conditional
### and go straight to the line: gbif_path = glob(gbif_pattern)[0]
if not glob(gbif_pattern):
### only submit a download request to GBIF once
### if GBIF_DOWNLOAD_KEY is not defined in our environment, make the download request
if not 'GBIF_DOWNLOAD_KEY' in os.environ:
### submit a query to GBIF
gbif_query = occ.download([
### add your species key here
f"speciesKey = {species_key}",
### filter out results that are missing coordinates
"hasCoordinate = True",
### restrict to the years 2015 through 2024
"year = 2015,2024",
])
os.environ['GBIF_DOWNLOAD_KEY'] = gbif_query[0]
# Wait for the download to build
download_key = os.environ['GBIF_DOWNLOAD_KEY']
### use the occurrence command module in pygbif to get the metadata
wait = occ.download_meta(download_key)['status']
### check if the status of the download = "SUCCEEDED"
### wait and loop through until it finishes
while wait != 'SUCCEEDED':
wait = occ.download_meta(download_key)['status']
### don't want to re-query the API in the loop too frequently
time.sleep(5)
# Download GBIF data when it's ready
download_info = occ.download_get(
os.environ['GBIF_DOWNLOAD_KEY'],
path=gbif_dir)
# Unzip GBIF data using the zipfile package
with zipfile.ZipFile(download_info['path']) as download_zip:
download_zip.extractall(path=gbif_dir)
# Find the extracted .csv file path (take the first result)
gbif_path = glob(gbif_pattern)[0]
# (This run's extracted file was 0023259-251120083545085.csv; the glob above
# already assigned its path to gbif_path, so no hard-coded path is needed.)
# load csv
gbif_data = pd.read_csv(
gbif_path,
delimiter='\t',
index_col='gbifID',
usecols=['gbifID', 'decimalLatitude', 'decimalLongitude', 'month', 'year']
)
gbif_data.head()
| decimalLatitude | decimalLongitude | month | year | |
|---|---|---|---|---|
| gbifID | ||||
| 5892763565 | 43.756638 | -110.667275 | 4 | 2024 |
| 5837855429 | 42.538211 | -110.096143 | 9 | 2024 |
| 5827911939 | 42.105447 | -110.435588 | 7 | 2023 |
| 5759742498 | 45.038401 | -108.962072 | 4 | 2016 |
| 5716416504 | 48.167900 | -106.964000 | 9 | 2016 |
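The fixed five-second polling loop above can also be written with exponential backoff, which is gentler on the GBIF API for long-running downloads. A minimal sketch (the `wait_until` helper and its parameters are my own, not part of pygbif):

```python
import time

def wait_until(check, max_wait=600, base=5):
    """Poll check() until it returns True, doubling the delay each try."""
    waited, delay = 0, base
    while waited <= max_wait:
        if check():
            return True
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 60)  # cap the polling interval at 60 s
    return False

# Demo: a check that succeeds on the third call (base=0 so the demo is instant)
calls = {'n': 0}
def done():
    calls['n'] += 1
    return calls['n'] >= 3

print(wait_until(done, base=0))  # True
```

In the real workflow, `check` would wrap `occ.download_meta(download_key)['status'] == 'SUCCEEDED'` with the default five-second base delay.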
Step 1.4: Convert to geodataframe¶
# Convert to GeoDataFrame
gbif_gdf = (
gpd.GeoDataFrame(
gbif_data,
geometry=gpd.points_from_xy(
gbif_data.decimalLongitude,
gbif_data.decimalLatitude),
crs="EPSG:4326")
# Select the desired columns
[['year','geometry']]
)
gbif_gdf
| year | geometry | |
|---|---|---|
| gbifID | ||
| 5892763565 | 2024 | POINT (-110.66728 43.75664) |
| 5837855429 | 2024 | POINT (-110.09614 42.53821) |
| 5827911939 | 2023 | POINT (-110.43559 42.10545) |
| 5759742498 | 2016 | POINT (-108.96207 45.0384) |
| 5716416504 | 2016 | POINT (-106.964 48.1679) |
| ... | ... | ... |
| 1211973730 | 2015 | POINT (-110.63114 43.64882) |
| 1211972443 | 2015 | POINT (-109.03738 40.48684) |
| 1144280784 | 2015 | POINT (-113.54315 43.28434) |
| 1143532587 | 2015 | POINT (-110.67129 43.75164) |
| 1132408514 | 2015 | POINT (-113.52872 43.5985) |
13686 rows × 2 columns
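Before mapping, it is worth a quick coordinate sanity check: every record should fall in a plausible longitude/latitude range for the western US. A minimal sketch with toy coordinates (the bounding box here is illustrative, not an official range map):

```python
import pandas as pd

# Two toy records: one near Wyoming, one far outside the western US
pts = pd.DataFrame({'decimalLongitude': [-108.5, -75.0],
                    'decimalLatitude': [43.0, 40.0]})

# Illustrative western-US box; real range maps are far more precise
in_box = (pts.decimalLongitude.between(-125, -102)
          & pts.decimalLatitude.between(31, 49))
print((~in_box).sum())  # 1 suspicious record
```

The same `between` filter applied to `gbif_data` would flag any records with swapped or mis-signed coordinates before they silently vanish in the spatial join later.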
Step 2: Load in BLM Habitat Management Areas (HMAs)¶
- These were downloaded from the BLM GIS Data Portal
Step 2.1: Load data and calculate HMA shape area (m²)¶
# Read in HMA geopackage
blm_gdf = gpd.read_file(os.path.join(blm_dir,'BLM_WesternUS_GRSG_ROD_HabitatMgmtAreas_Feb_2025_-297793168263603713.gpkg'))
# Check data
blm_gdf.head(10)
| EIS_HAB | Source | Habitat_Type | geometry | |
|---|---|---|---|---|
| 0 | BighornBasin_GH_ROD | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... |
| 1 | BighornBasin_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-1010794.04 2465604.125, -1011... |
| 2 | Billings_GH_ROD | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-966098.796 2643394.812, -9661... |
| 3 | Billings_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-1012069.205 2549740.156, -101... |
| 4 | Billings_RH_ROD | ROD_HabitatMgmtAreas_Feb2020 | RHMA | MULTIPOLYGON (((-1015485.614 2520246.55, -1015... |
| 5 | Buffalo_GH_ROD | HabitatManagementArea_Update_May2022 | GHMA | MULTIPOLYGON (((-762606.762 2316326.368, -7622... |
| 6 | Buffalo_PH_ROD | HabitatManagementArea_Update_May2022 | PHMA | MULTIPOLYGON (((-800031.171 2485373.319, -7997... |
| 7 | HiLine_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-902993.058 2866628.093, -9033... |
| 8 | Idaho_GH_NoAction | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-1630138.44 2587646.687, -1630... |
| 9 | Idaho_IH_NoAction | ROD_HabitatMgmtAreas_Feb2020 | IHMA | MULTIPOLYGON (((-1351863.299 2392520.024, -135... |
# Calculate shape area for each EIS_HAB
# Check CRS
print(blm_gdf.crs)
# Calculate area (in units of the CRS - check with blm_gdf.crs)
blm_gdf['area'] = blm_gdf.geometry.area
# Check head
blm_gdf.head(10)
PROJCS["NAD_1983_Albers",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]],PROJECTION["Albers_Conic_Equal_Area"],PARAMETER["latitude_of_center",23],PARAMETER["longitude_of_center",-96],PARAMETER["standard_parallel_1",29.5],PARAMETER["standard_parallel_2",45.5],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH]]
| EIS_HAB | Source | Habitat_Type | geometry | area | |
|---|---|---|---|---|---|
| 0 | BighornBasin_GH_ROD | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 |
| 1 | BighornBasin_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-1010794.04 2465604.125, -1011... | 7.212493e+09 |
| 2 | Billings_GH_ROD | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-966098.796 2643394.812, -9661... | 1.268364e+10 |
| 3 | Billings_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-1012069.205 2549740.156, -101... | 3.304169e+09 |
| 4 | Billings_RH_ROD | ROD_HabitatMgmtAreas_Feb2020 | RHMA | MULTIPOLYGON (((-1015485.614 2520246.55, -1015... | 4.122845e+08 |
| 5 | Buffalo_GH_ROD | HabitatManagementArea_Update_May2022 | GHMA | MULTIPOLYGON (((-762606.762 2316326.368, -7622... | 2.040145e+10 |
| 6 | Buffalo_PH_ROD | HabitatManagementArea_Update_May2022 | PHMA | MULTIPOLYGON (((-800031.171 2485373.319, -7997... | 4.409289e+09 |
| 7 | HiLine_PH_ROD | ROD_HabitatMgmtAreas_Feb2020 | PHMA | MULTIPOLYGON (((-902993.058 2866628.093, -9033... | 9.542681e+09 |
| 8 | Idaho_GH_NoAction | ROD_HabitatMgmtAreas_Feb2020 | GHMA | MULTIPOLYGON (((-1630138.44 2587646.687, -1630... | 1.759068e+10 |
| 9 | Idaho_IH_NoAction | ROD_HabitatMgmtAreas_Feb2020 | IHMA | MULTIPOLYGON (((-1351863.299 2392520.024, -135... | 1.828142e+10 |
# Plot real quick
blm_gdf.plot()
<Axes: >
Step 2.2: BLM region-specific habitat types¶
- Control for region-specific habitat-type designations by assigning each a priority tier for simpler plotting
Habitat Types in our data
| Acronym | Full Name | Region of Use | Assigned Priority Tier |
|---|---|---|---|
| PHMA | Priority Habitat Management Area | Range-wide | High |
| IHMA | Important Habitat Management Area | ID, MT | High |
| AM | Anthro Mountain Habitat Management Area | UT | High |
| RHMA | Restoration Habitat Management Area | MT, ND, SD | High |
| GHMA | General Habitat Management Area | Range-wide | Medium |
| LCHMA | Linkage and Connectivity Habitat Management Area | CO | Medium |
| OHMA | Other Habitat Management Area | NV, CA | Low |
# Simplify names for better plotting
# Split EIS_HAB by underscore into new columns
# (note: this overwrites the original 'Source' column with the EIS_HAB suffix)
blm_gdf[['Area', 'HabCode', 'Source']] = blm_gdf['EIS_HAB'].str.split('_', expand=True)
# Create a 'priority' column that will make plotting cleaner
blm_gdf['priority'] = blm_gdf['HabCode'].map({
'PH': 'High',
'IH': 'High',
'AM': 'High',
'RH': 'High',
'GH': 'Medium',
'LCH': 'Medium',
'OH': 'Low'
})
# Check
blm_gdf.head()
| EIS_HAB | Source | Habitat_Type | geometry | area | Area | HabCode | priority | |
|---|---|---|---|---|---|---|---|---|
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium |
| 1 | BighornBasin_PH_ROD | ROD | PHMA | MULTIPOLYGON (((-1010794.04 2465604.125, -1011... | 7.212493e+09 | BighornBasin | PH | High |
| 2 | Billings_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-966098.796 2643394.812, -9661... | 1.268364e+10 | Billings | GH | Medium |
| 3 | Billings_PH_ROD | ROD | PHMA | MULTIPOLYGON (((-1012069.205 2549740.156, -101... | 3.304169e+09 | Billings | PH | High |
| 4 | Billings_RH_ROD | ROD | RHMA | MULTIPOLYGON (((-1015485.614 2520246.55, -1015... | 4.122845e+08 | Billings | RH | High |
Step 2.3: Plot by region name¶
# Plot first by region
# Area color palette
area_colors = {
'BighornBasin': '#A5A58D',
'Billings': '#B7B7A4',
'Buffalo': '#CB997E',
'HiLine': '#DDBEA9',
'Idaho': '#E4BAA0',
'Lander': '#B08968',
'Lewistown': '#997B66',
'MilesCity': '#8C6A5D',
'NinePlan': '#7F6752',
'NorthDakota': '#6F5E50',
'NVCA': '#A3A380',
'SDakota': '#C2C5AA',
'SWMT': '#8DA47E',
'Utah': '#7B8F7A',
'Oregon': '#657B69',
'NWCO': '#556B5A',
}
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Convert using that western US albers
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds
# Buffer for adjusting the zoom
buffer = 100000 # this is in meters
# Create figure with same projection
fig = plt.figure(figsize=(16, 10))
ax = fig.add_subplot(1, 1, 1, projection=projection)
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Plot each polygon - look up color from dictionary
for idx, row in blm_albers.iterrows():
color = area_colors.get(row['Area'], '#999999') # Default gray if not found
blm_albers[blm_albers.index == idx].plot(
ax=ax,
color=color,
edgecolor=color,
linewidth=1,
zorder=2
)
# Create legend using the color mapping
legend_handles = [Patch(facecolor=color, edgecolor='#333333', label=area)
for area, color in sorted(area_colors.items())]
ax.legend(
handles=legend_handles,
loc='lower right',
fontsize=8,
title='Management Area',
title_fontsize=10,
ncol=2,
framealpha=0.9
)
ax.set_title('BLM Greater Sage-Grouse\nHabitat Management Areas',
fontsize=18, fontweight='bold', pad=15)
ax.gridlines(draw_labels=True, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')
plt.tight_layout()
plt.show()
Step 2.4: Plot by priority¶
# Plot priority levels in separate panels
# Priority color palette
prio_colors = {
'Low': '#aa400d',
'Medium': '#987F32',
'High': '#2f501b'
}
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Convert using that western US albers
blm_prio_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_prio_albers.total_bounds
# Buffer for adjusting the zoom
buffer = 100000 # this is in meters
# Create figure with 3 subplots (1 row, 3 columns)
fig = plt.figure(figsize=(24, 8))
# Define the order of priority levels for panels
priority_order = ['Low', 'Medium', 'High']
# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Loop through each priority level and create a subplot
for i, priority in enumerate(priority_order, 1):
# Create subplot with projection
ax = fig.add_subplot(1, 3, i, projection=projection)
# Set extent (same for all panels for consistency)
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Filter data for this priority level
priority_data = blm_prio_albers[blm_prio_albers['priority'] == priority]
# Plot the priority areas
if not priority_data.empty:
priority_data.plot(
ax=ax,
color=prio_colors[priority],
edgecolor=prio_colors[priority],
linewidth=0.1,
zorder=2
)
# Add title for each panel
ax.set_title(f'{priority} Priority',
fontsize=16, fontweight='bold', pad=10)
# Add gridlines
ax.gridlines(draw_labels=True, linewidth=0.5, color='gray',
alpha=0.5, linestyle='--')
# Add overall title
fig.suptitle('BLM Greater Sage-Grouse Habitat Management Areas by Priority Level',
fontsize=20, fontweight='bold', y=0.98)
plt.tight_layout()
plt.show()
# Dissolve polygons by priority level
blm_prio_gdf = blm_gdf.dissolve(by='priority', as_index=False)
# Reduce columns
blm_prio_gdf = blm_prio_gdf[['priority', 'geometry']]
# Re-calculate area for each dissolved polygon
blm_prio_gdf['area'] = blm_prio_gdf.geometry.area
# Export as geopackage
blm_prio_gdf.to_file(os.path.join(blm_dir, 'blm_grsg_priority_dissolved.gpkg'), driver='GPKG')
INFO:Created 3 records
# Check
blm_prio_gdf
| priority | geometry | area | |
|---|---|---|---|
| 0 | High | MULTIPOLYGON (((-1724455.818 1921096.47, -1724... | 2.648593e+11 |
| 1 | Low | MULTIPOLYGON (((-2020229.501 2098929.129, -202... | 2.932351e+10 |
| 2 | Medium | MULTIPOLYGON (((-1973743.418 2118522.438, -197... | 2.840932e+11 |
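A quick sanity check on the dissolve: each dissolved polygon's area should roughly equal the sum of its component HMA areas (exactly equal only if the source polygons don't overlap). The bookkeeping can be sketched with plain pandas sums (the numbers here are toy values, not the real HMA areas):

```python
import pandas as pd

# Toy per-HMA areas in m², two High-priority parts and one Medium
parts = pd.DataFrame({'priority': ['High', 'High', 'Medium'],
                      'area': [7.2e9, 4.4e9, 1.56e10]})

# Summing component areas per priority should track the dissolved areas
sums = parts.groupby('priority')['area'].sum()
print(sums.loc['High'])  # 11600000000.0
```

Comparing `blm_gdf.groupby('priority')['area'].sum()` against `blm_prio_gdf['area']` would confirm the dissolve behaved as expected.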
# Plot second by priority
# prior color palette
prio_colors = {
'Low': '#aa400d',
'Medium': '#987F32',
'High': '#2f501b'
}
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Convert using that western US albers
blm_prio_albers = blm_prio_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_prio_albers.total_bounds
# Buffer for adjusting the zoom
buffer = 100000 # this is in meters
# Create figure with same projection
fig = plt.figure(figsize=(16, 10))
ax = fig.add_subplot(1, 1, 1, projection=projection)
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Plot each polygon - look up color from dictionary
for idx, row in blm_prio_albers.iterrows():
color = prio_colors.get(row['priority'], '#999999') # Default gray if not found
blm_prio_albers[blm_prio_albers.index == idx].plot(
ax=ax,
color=color,
edgecolor=color,
linewidth=0.1,
#zorder=2
)
# Create legend using the color mapping
legend_handles = [Patch(facecolor=color, edgecolor='#333333', label=area)
for area, color in sorted(prio_colors.items())]
ax.legend(
handles=legend_handles,
loc='lower right',
fontsize=8,
title='Priority Level',
title_fontsize=10,
ncol=2,
framealpha=0.9
)
ax.set_title('BLM Greater Sage-Grouse\nHabitat Priority',
fontsize=18, fontweight='bold', pad=15)
ax.gridlines(draw_labels=True, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')
plt.tight_layout()
plt.show()
# Get area by priority from blm_prio_gdf
priority_area = blm_prio_gdf.copy()
# Sort by priority order
priority_order = ['High', 'Medium', 'Low']
priority_area['priority'] = pd.Categorical(
priority_area['priority'],
categories=priority_order,
ordered=True
)
priority_area = priority_area.sort_values('priority')
# Convert area to thousand km²
priority_area['area_thousand_km2'] = priority_area['area'] / 1e9
# Create bar plot
fig, ax = plt.subplots(figsize=(8, 6))
# Get colors for each priority
colors = [prio_colors.get(p, '#999999') for p in priority_area['priority']]
bars = ax.bar(priority_area['priority'], priority_area['area_thousand_km2'],
color=colors, edgecolor=colors, linewidth=1.5)
ax.set_xlabel('Priority Level', fontsize=12, fontweight='bold')
ax.set_ylabel('Area (thousand km²)', fontsize=12, fontweight='bold')
ax.set_title('Habitat Area by Priority Level', fontsize=14, fontweight='bold')
# Add value labels on bars
for bar in bars:
height = bar.get_height()
ax.annotate(f'{height:.1f}',
xy=(bar.get_x() + bar.get_width() / 2, height),
xytext=(0, 3), textcoords="offset points",
ha='center', va='bottom', fontsize=11, fontweight='bold')
plt.tight_layout()
plt.show()
Step 3: Spatially join BLM HMAs and GBIF observations¶
# Spatial join
gbif_hma_gdf = (
blm_gdf
# Match the CRS of the GBIF data and the HMAs
# Note: reproject the smaller dataset to save time
.to_crs(gbif_gdf.crs)
# Find the HMA containing each observation
.sjoin(
gbif_gdf,
how='inner',
predicate='contains')
)
gbif_hma_gdf
| EIS_HAB | Source | Habitat_Type | geometry | area | Area | HabCode | priority | gbifID | year | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-109.05687 44.99036, -109.0568... | 1.562871e+10 | BighornBasin | GH | Medium | 1382647861 | 2015 |
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-109.05687 44.99036, -109.0568... | 1.562871e+10 | BighornBasin | GH | Medium | 1383834604 | 2015 |
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-109.05687 44.99036, -109.0568... | 1.562871e+10 | BighornBasin | GH | Medium | 3168179146 | 2020 |
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-109.05687 44.99036, -109.0568... | 1.562871e+10 | BighornBasin | GH | Medium | 1727319869 | 2016 |
| 0 | BighornBasin_GH_ROD | ROD | GHMA | MULTIPOLYGON (((-109.05687 44.99036, -109.0568... | 1.562871e+10 | BighornBasin | GH | Medium | 1378430833 | 2015 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 37 | NWCO_LCH_2025Update | 2025Update | LCHMA | MULTIPOLYGON (((-108.09005 40.28915, -108.0920... | 1.377762e+09 | NWCO | LCH | Medium | 4091610715 | 2023 |
| 37 | NWCO_LCH_2025Update | 2025Update | LCHMA | MULTIPOLYGON (((-108.09005 40.28915, -108.0920... | 1.377762e+09 | NWCO | LCH | Medium | 4903749174 | 2024 |
| 37 | NWCO_LCH_2025Update | 2025Update | LCHMA | MULTIPOLYGON (((-108.09005 40.28915, -108.0920... | 1.377762e+09 | NWCO | LCH | Medium | 5063829263 | 2019 |
| 37 | NWCO_LCH_2025Update | 2025Update | LCHMA | MULTIPOLYGON (((-108.09005 40.28915, -108.0920... | 1.377762e+09 | NWCO | LCH | Medium | 2139393221 | 2018 |
| 37 | NWCO_LCH_2025Update | 2025Update | LCHMA | MULTIPOLYGON (((-108.09005 40.28915, -108.0920... | 1.377762e+09 | NWCO | LCH | Medium | 2137651098 | 2018 |
10554 rows × 10 columns
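The inner join keeps only observations that land inside an HMA (10,554 of the 13,686 GBIF records here), silently discarding the rest. The same "how many rows were dropped" check can be illustrated with a plain pandas merge using `indicator=True` (toy data; the real check would compare the sjoin row counts):

```python
import pandas as pd

# Toy observations: the last one falls in no HMA (key is None)
obs = pd.DataFrame({'gbifID': [1, 2, 3, 4],
                    'hma': ['A', 'A', 'B', None]})
hmas = pd.DataFrame({'hma': ['A', 'B'],
                     'priority': ['High', 'Medium']})

# indicator=True adds a '_merge' column flagging unmatched left rows
joined = obs.merge(hmas, on='hma', how='left', indicator=True)
dropped = (joined['_merge'] == 'left_only').sum()
print(dropped)  # 1 observation matched no HMA
```

For the spatial version, `len(gbif_gdf) - len(gbif_hma_gdf)` gives the count of occurrences outside all HMAs, which is itself ecologically interesting.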
Step 4: Normalize observation data¶
Step 4.1: Density of observations / area of HMA¶
# Raw occurrence data
occurrence_df = (
gbif_hma_gdf
# Reset index
.reset_index()
# compare across HMA regions
.groupby(['EIS_HAB', 'year'])
# ...count the number of occurrences
.agg(occurrences=('gbifID', 'count'),
area=('area', 'first'))
)
# Normalize by area
occurrence_df['density'] = (
occurrence_df.occurrences
/ occurrence_df.area
)
occurrence_df
| occurrences | area | density | ||
|---|---|---|---|---|
| EIS_HAB | year | |||
| BighornBasin_GH_ROD | 2015 | 4 | 1.562871e+10 | 2.559392e-10 |
| 2016 | 10 | 1.562871e+10 | 6.398480e-10 | |
| 2017 | 4 | 1.562871e+10 | 2.559392e-10 | |
| 2018 | 4 | 1.562871e+10 | 2.559392e-10 | |
| 2019 | 13 | 1.562871e+10 | 8.318024e-10 | |
| ... | ... | ... | ... | ... |
| Utah_PH_NoAction | 2020 | 135 | 2.283707e+10 | 5.911442e-09 |
| 2021 | 126 | 2.283707e+10 | 5.517346e-09 | |
| 2022 | 173 | 2.283707e+10 | 7.575403e-09 | |
| 2023 | 190 | 2.283707e+10 | 8.319807e-09 | |
| 2024 | 261 | 2.283707e+10 | 1.142879e-08 |
337 rows × 3 columns
Step 4.2: Remove rare observations¶
# Get rid of rare observations (possible misidentification?)
occurrence_df = occurrence_df[occurrence_df.occurrences>1]
occurrence_df
| occurrences | area | density | ||
|---|---|---|---|---|
| EIS_HAB | year | |||
| BighornBasin_GH_ROD | 2015 | 4 | 1.562871e+10 | 2.559392e-10 |
| 2016 | 10 | 1.562871e+10 | 6.398480e-10 | |
| 2017 | 4 | 1.562871e+10 | 2.559392e-10 | |
| 2018 | 4 | 1.562871e+10 | 2.559392e-10 | |
| 2019 | 13 | 1.562871e+10 | 8.318024e-10 | |
| ... | ... | ... | ... | ... |
| Utah_PH_NoAction | 2020 | 135 | 2.283707e+10 | 5.911442e-09 |
| 2021 | 126 | 2.283707e+10 | 5.517346e-09 | |
| 2022 | 173 | 2.283707e+10 | 7.575403e-09 | |
| 2023 | 190 | 2.283707e+10 | 8.319807e-09 | |
| 2024 | 261 | 2.283707e+10 | 1.142879e-08 |
302 rows × 3 columns
Step 4.3: Mean number of occurrences per HMA¶
# Take the mean by priority
mean_occurrences_by_hma = (
occurrence_df
.groupby('EIS_HAB')
.mean()
)
mean_occurrences_by_hma
| occurrences | area | density | |
|---|---|---|---|
| EIS_HAB | |||
| BighornBasin_GH_ROD | 8.200000 | 1.562871e+10 | 5.246754e-10 |
| BighornBasin_PH_ROD | 9.555556 | 7.212493e+09 | 1.324862e-09 |
| Billings_GH_ROD | 6.000000 | 1.268364e+10 | 4.730502e-10 |
| Billings_PH_ROD | 24.700000 | 3.304169e+09 | 7.475405e-09 |
| Buffalo_GH_ROD | 10.444444 | 2.040145e+10 | 5.119461e-10 |
| Buffalo_PH_ROD | 9.666667 | 4.409289e+09 | 2.192341e-09 |
| HiLine_GH_NoAction | 4.555556 | 3.797708e+09 | 1.199554e-09 |
| HiLine_PH_ROD | 35.700000 | 9.542681e+09 | 3.741087e-09 |
| Idaho_GH_NoAction | 32.200000 | 1.759068e+10 | 1.830514e-09 |
| Idaho_IH_NoAction | 35.700000 | 1.828142e+10 | 1.952802e-09 |
| Idaho_PH_NoAction | 87.100000 | 2.435811e+10 | 3.575811e-09 |
| Lander_GH_ROD | 6.400000 | 4.537717e+09 | 1.410401e-09 |
| Lander_PH_ROD | 20.400000 | 8.803880e+09 | 2.317160e-09 |
| Lewistown_GH_ROD | 3.444444 | 4.107464e+09 | 8.385818e-10 |
| Lewistown_PH_ROD | 12.400000 | 4.888587e+09 | 2.536520e-09 |
| MilesCity_GH_ROD | 9.444444 | 4.835951e+10 | 1.952965e-10 |
| MilesCity_PH_ROD | 6.375000 | 1.449690e+10 | 4.397492e-10 |
| MilesCity_RH_ROD | 3.800000 | 1.385128e+09 | 2.743429e-09 |
| NVCA_GH_2022Update | 11.200000 | 3.809569e+10 | 2.939965e-10 |
| NVCA_OH_2022Update | 3.375000 | 2.932351e+10 | 1.150954e-10 |
| NVCA_PH_2022Update | 62.600000 | 5.181665e+10 | 1.208106e-09 |
| NWCO_GH_2025Update | 21.300000 | 6.922874e+09 | 3.076757e-09 |
| NWCO_LCH_2025Update | 2.500000 | 1.377762e+09 | 1.814537e-09 |
| NWCO_PH_2025Update | 213.200000 | 7.749982e+09 | 2.750974e-08 |
| NinePlan_GH_ROD | 89.000000 | 6.522446e+10 | 1.364519e-09 |
| NinePlan_PH_ROD | 80.100000 | 4.173564e+10 | 1.919223e-09 |
| NorthDakota_PH_ROD | 5.666667 | 1.866643e+09 | 3.035753e-09 |
| Oregon_GH_2025Update | 4.714286 | 2.777459e+10 | 1.697338e-10 |
| Oregon_PH_2025Update | 61.300000 | 3.212097e+10 | 1.908411e-09 |
| SDakota_GH_ROD | 2.000000 | 4.750690e+09 | 4.209915e-10 |
| SDakota_PH_ROD | 8.333333 | 3.980246e+09 | 2.093673e-09 |
| SWMT_GH_NoAction | 4.285714 | 5.038864e+09 | 8.505319e-10 |
| SWMT_PH_NoAction | 23.900000 | 5.491400e+09 | 4.352260e-09 |
| Utah_AM_ROD | 2.000000 | 1.703760e+08 | 1.173874e-08 |
| Utah_GH_ROD | 7.900000 | 6.822016e+09 | 1.158016e-09 |
| Utah_PH_NoAction | 147.900000 | 2.283707e+10 | 6.476313e-09 |
Step 4.4: Mean number of occurrences per year¶
# Take the mean by year
mean_occurrences_by_year = (
occurrence_df
.groupby('year')
.mean()
)
mean_occurrences_by_year
| occurrences | area | density | |
|---|---|---|---|
| year | |||
| 2015 | 23.925926 | 1.844972e+10 | 2.136301e-09 |
| 2016 | 23.655172 | 1.919683e+10 | 1.952154e-09 |
| 2017 | 30.433333 | 1.707023e+10 | 2.719608e-09 |
| 2018 | 28.935484 | 1.744539e+10 | 2.550014e-09 |
| 2019 | 25.827586 | 1.738651e+10 | 2.178385e-09 |
| 2020 | 29.548387 | 1.734764e+10 | 2.142464e-09 |
| 2021 | 34.468750 | 1.767348e+10 | 3.050800e-09 |
| 2022 | 40.451613 | 1.796332e+10 | 3.393427e-09 |
| 2023 | 48.866667 | 1.741098e+10 | 4.480251e-09 |
| 2024 | 59.031250 | 1.777261e+10 | 4.332340e-09 |
Step 4.5: Normalize by areal density, mean occurrences per year, and mean occurrences per HMA (aka region)¶
# Normalize by space and time for sampling effort
occurrence_df['norm_occurrences'] = (
occurrence_df[['density']]
/ mean_occurrences_by_year[['density']]
/ mean_occurrences_by_hma[['density']]
)
# The ['EIS_HAB', 'year'] MultiIndex is already set, so this is not needed
#occurrence_df = occurrence_df.set_index(['priority', 'year'])
occurrence_df
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\783070017.py:2: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  occurrence_df['norm_occurrences'] = (
| occurrences | area | density | norm_occurrences | ||
|---|---|---|---|---|---|
| EIS_HAB | year | ||||
| BighornBasin_GH_ROD | 2015 | 4 | 1.562871e+10 | 2.559392e-10 | 2.283408e+08 |
| 2016 | 10 | 1.562871e+10 | 6.398480e-10 | 6.247007e+08 | |
| 2017 | 4 | 1.562871e+10 | 2.559392e-10 | 1.793659e+08 | |
| 2018 | 4 | 1.562871e+10 | 2.559392e-10 | 1.912950e+08 | |
| 2019 | 13 | 1.562871e+10 | 8.318024e-10 | 7.277713e+08 | |
| ... | ... | ... | ... | ... | ... |
| Utah_PH_NoAction | 2020 | 135 | 2.283707e+10 | 5.911442e-09 | 4.260416e+08 |
| 2021 | 126 | 2.283707e+10 | 5.517346e-09 | 2.792471e+08 | |
| 2022 | 173 | 2.283707e+10 | 7.575403e-09 | 3.446985e+08 | |
| 2023 | 190 | 2.283707e+10 | 8.319807e-09 | 2.867366e+08 | |
| 2024 | 261 | 2.283707e+10 | 1.142879e-08 | 4.073332e+08 |
302 rows × 4 columns
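The SettingWithCopyWarning above comes from assigning a new column to `occurrence_df`, which is a filtered slice of the original frame (Step 4.2). Taking an explicit `.copy()` when filtering makes ownership clear and silences the warning. A minimal sketch with toy numbers:

```python
import pandas as pd

df = pd.DataFrame({'occurrences': [1, 4, 13], 'area': [2.0, 4.0, 5.0]})

# .copy() makes this an independent frame, not a view of df,
# so the column assignment below raises no SettingWithCopyWarning
filtered = df[df.occurrences > 1].copy()
filtered['density'] = filtered.occurrences / filtered.area
print(filtered['density'].tolist())  # [1.0, 2.6]
```

In this workflow, writing `occurrence_df = occurrence_df[occurrence_df.occurrences > 1].copy()` in Step 4.2 would make all the later column assignments warning-free.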
Step 4.6: Plot occurrences per priority area¶
# Raw occurrence data
occurrence_prio_df = (
gbif_hma_gdf
# Reset index
.reset_index()
# compare across HMA regions
.groupby(['priority', 'year'])
# ...count the number of occurrences
.agg(occurrences=('gbifID', 'count'),
area=('area', 'first'))
)
# Normalize by area
# Note: the original cell divided occurrence_df.occurrences by occurrence_df.area
# here; that frame's ['EIS_HAB', 'year'] index does not align with this
# ['priority', 'year'] index, which produced the NaN densities shown below.
# Dividing occurrence_prio_df's own columns fixes the alignment.
# (Also note 'area' above keeps only the first HMA's area per priority group,
#  so these densities are rough at best.)
occurrence_prio_df['density'] = (
occurrence_prio_df.occurrences
/ occurrence_prio_df.area
)
occurrence_prio_df.head(10)
| priority | year | occurrences | area | density |
|---|---|---|---|---|
| High | 2015 | 494 | 7.212493e+09 | NaN |
| | 2016 | 546 | 7.212493e+09 | NaN |
| | 2017 | 729 | 7.212493e+09 | NaN |
| | 2018 | 703 | 7.212493e+09 | NaN |
| | 2019 | 592 | 7.212493e+09 | NaN |
| | 2020 | 687 | 7.212493e+09 | NaN |
| | 2021 | 902 | 7.212493e+09 | NaN |
| | 2022 | 1071 | 7.212493e+09 | NaN |
| | 2023 | 1212 | 7.212493e+09 | NaN |
| | 2024 | 1448 | 7.212493e+09 | NaN |
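The NaN densities in the table above are a symptom of pandas index alignment: `occurrence_df` is indexed by `('EIS_HAB', 'year')` while `occurrence_prio_df` is indexed by `('priority', 'year')`, so element-wise division finds no matching labels. A minimal sketch of the behaviour (illustrative labels, not the real data):

```python
import pandas as pd

# Two Series indexed by different label sets (like 'EIS_HAB' vs 'priority')
by_hma = pd.Series([4.0, 10.0], index=["BighornBasin", "Casper"])
by_priority = pd.Series([494.0, 546.0], index=["High", "Medium"])

# Division aligns on index labels first; with zero overlap, every result is NaN
ratio = by_hma / by_priority
```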
# Reset index
plot_data = occurrence_prio_df.reset_index()
# Line plot
fig, ax = plt.subplots(figsize=(10, 6))
# Plot a line for each priority level
for priority, group in plot_data.groupby('priority'):
ax.plot(group['year'], group['occurrences'],
marker='o', linewidth=7, markersize=10,
color=prio_colors.get(priority, '#999999'),
label=priority)
ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Occurrences', fontsize=12, fontweight='bold')
ax.set_title('Greater Sage-Grouse Occurrences by Year and Priority',
fontsize=14, fontweight='bold')
# Show all years
ax.set_xticks(plot_data['year'].unique())
# Add legend
ax.legend(title='Priority Level', fontsize=10, title_fontsize=11)
# Add grid for readability
ax.grid(True, alpha=0.3, linestyle='--')
plt.tight_layout()
plt.show()
Step 5: Map GRSG occurrence across HMAs¶
Step 5.1: Join HMAs (geodataframe) to normalized occurrence data¶
# Set index for joining next cell
blm_gdf = blm_gdf.set_index('EIS_HAB')
# Reset index if needed
#blm_gdf = blm_gdf.reset_index()
# See index names
print(blm_gdf.index.name)
print(occurrence_df.index.names)
EIS_HAB
['EIS_HAB', 'year']
# Join the occurrences with the plotting GeoDataFrame
occurrence_gdf = blm_gdf.join(occurrence_df[['norm_occurrences']])
# Check to make sure the join worked ok --> there should be no NAs
print("Any NAs:", occurrence_gdf.isna().any())
print("Index Names:", occurrence_gdf.index.names)
occurrence_gdf.head()
Any NAs:
Source              False
Habitat_Type        False
geometry            False
area                False
Area                False
HabCode             False
priority            False
norm_occurrences    False
dtype: bool
Index Names: ['EIS_HAB', 'year']
| EIS_HAB | year | Source | Habitat_Type | geometry | area | Area | HabCode | priority | norm_occurrences |
|---|---|---|---|---|---|---|---|---|---|
| BighornBasin_GH_ROD | 2015 | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium | 2.283408e+08 |
| | 2016 | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium | 6.247007e+08 |
| | 2017 | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium | 1.793659e+08 |
| | 2018 | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium | 1.912950e+08 |
| | 2019 | ROD | GHMA | MULTIPOLYGON (((-1025054.083 2515176.322, -102... | 1.562871e+10 | BighornBasin | GH | Medium | 7.277713e+08 |
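The join above works because `DataFrame.join` can match a single-level index against one level of a MultiIndex with the same name, repeating each HMA's attributes across its years. A toy sketch of that broadcasting (hypothetical values):

```python
import pandas as pd

# One row per HMA, single-level index named 'EIS_HAB'
hmas = pd.DataFrame({"priority": ["Medium"]},
                    index=pd.Index(["BighornBasin_GH_ROD"], name="EIS_HAB"))

# Per-year data on an ('EIS_HAB', 'year') MultiIndex; join matches on the
# shared level name and repeats the HMA attributes for every year
idx = pd.MultiIndex.from_tuples([("BighornBasin_GH_ROD", 2015),
                                 ("BighornBasin_GH_ROD", 2016)],
                                names=["EIS_HAB", "year"])
occ = pd.DataFrame({"norm_occurrences": [2.28e8, 6.25e8]}, index=idx)

joined = hmas.join(occ)
```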
Step 5.2: Plot normalized occurrences by HMA and year¶
# Custom sage-green color palette
custom_palette = [
"#bfc09f",
"#aeb38e",
"#9da67d",
"#8c996d",
"#7b8c5e",
"#6a8050",
"#597441",
"#4a6834",
"#3c5c27",
"#2f501b",
"#234510",
"#183a06"
]
# Create a custom colormap from the palette
cmap = mcolors.ListedColormap(custom_palette)
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Reset index to make 'year' accessible as a column
occurrence_with_year = occurrence_gdf.reset_index()
# Convert using that western US albers
occurrence_albers = occurrence_with_year.to_crs(projection)
xmin, ymin, xmax, ymax = occurrence_albers.total_bounds
# Add buffer for padding
buffer = 100000 # meters
# Create figure with 3x3 subplots
fig = plt.figure(figsize=(20, 16))
# Create state boundaries feature (define once outside loop)
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Create a subplot for each year
# Note: excluding 2015 just to make the 3x3 plot look better
for i, year_num in enumerate(range(2016, 2025)):
# Create subplot with Albers projection
ax = fig.add_subplot(3, 3, i + 1, projection=projection)
# Filter data for current year
year_data = occurrence_albers[occurrence_albers['year'] == year_num]
# Set extent to keep zoom consistent
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Plot the occurrence data
if len(year_data) > 0:
year_data.plot(
ax=ax,
column='norm_occurrences',
cmap=cmap,
legend=False,
zorder=2
)
# Add year as title for each panel
ax.set_title(year_num, fontsize=14, fontweight='bold')
# Add gridlines
ax.gridlines(draw_labels=False, linewidth=0.5, color='gray', alpha=0.5, linestyle='--')
# Add overall title
fig.suptitle('Greater Sage-Grouse Occurrence, 2016-2024',
fontsize=20, fontweight='bold', y=0.995)
# Add a single colorbar for all subplots
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
sm = plt.cm.ScalarMappable(
cmap=cmap,
norm=mcolors.Normalize(
vmin=occurrence_with_year['norm_occurrences'].min(),
vmax=occurrence_with_year['norm_occurrences'].max()
)
)
sm.set_array([])
cbar = fig.colorbar(sm, cax=cbar_ax)
cbar.set_label('Normalized occurrence', fontsize=12, fontweight='bold')
# Adjust layout - plt.tight_layout is skipped here because it is not
# compatible with cartopy GeoAxes; subplots_adjust above handles spacing
# Show the plot
plt.show()
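Because the colorbar above pairs a 12-color `ListedColormap` with a continuous `Normalize`, values are snapped into discrete color steps rather than a smooth gradient, which can contribute to banded-looking maps. A minimal sketch of that binning (3 colors for brevity):

```python
import matplotlib.colors as mcolors

cmap = mcolors.ListedColormap(["#bfc09f", "#7b8c5e", "#183a06"])
norm = mcolors.Normalize(vmin=0.0, vmax=3.0)

# Normalize maps a value into [0, 1]; the ListedColormap then snaps it
# to one of its N discrete entries
low = mcolors.to_hex(cmap(norm(0.2)))   # falls in the first bin
high = mcolors.to_hex(cmap(norm(2.9)))  # falls in the last bin
```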
Step 5.2 (part 2): Check whether these data look normally distributed - there seems to be a strange pattern in the normalized figure¶
# Just curious if these data look normal - trying to figure out this pattern
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Histogram
axes[0].hist(occurrence_gdf['norm_occurrences'].dropna(), bins=50, edgecolor='black')
axes[0].set_xlabel('Normalized Occurrences')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Distribution of Normalized Occurrences')
axes[0].axvline(occurrence_gdf['norm_occurrences'].mean(),
color='red', linestyle='--', label='Mean')
axes[0].axvline(occurrence_gdf['norm_occurrences'].median(),
color='orange', linestyle='--', label='Median')
axes[0].legend()
# Box plot by year
occurrence_gdf.reset_index().boxplot(column='norm_occurrences', by='year', ax=axes[1])
axes[1].set_xlabel('Year')
axes[1].set_ylabel('Normalized Occurrences')
axes[1].set_title('Distribution by Year')
plt.suptitle('')
plt.tight_layout()
plt.show()
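If the histogram confirms heavy right skew, log-transforming before plotting compresses the hotspots; `np.log1p` is a common choice because it is defined at zero counts. A sketch with toy counts:

```python
import numpy as np

counts = np.array([0, 1, 10, 100, 1000])
logged = np.log1p(counts)  # log(1 + x): zero stays zero, large counts compress
```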
Step 5.3: Compare with raw occurrence data (GBIF)¶
# Plot density of GBIF observations per year using hexbins
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Convert GBIF data to Albers projection
gbif_albers = gbif_gdf.to_crs(projection)
# Get bounds from BLM data for consistent extent
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds
# Add buffer for padding
buffer = 100000 # meters
# Create figure with 3x3 subplots (for years 2016-2024)
fig = plt.figure(figsize=(20, 16))
# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Create a subplot for each year
for i, year_num in enumerate(range(2016, 2025)):
# Create subplot with Albers projection
ax = fig.add_subplot(3, 3, i + 1, projection=projection)
# Filter data for current year
year_data = gbif_albers[gbif_albers['year'] == year_num]
# Set extent to keep zoom consistent
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Plot BLM habitat areas in light gray as context
blm_albers.plot(ax=ax, facecolor='none', edgecolor='gray',
linewidth=0.5, alpha=0.3, zorder=1)
# Plot observation density using hexbin
if len(year_data) > 0:
x = year_data.geometry.x
y = year_data.geometry.y
hexbin = ax.hexbin(
x, y,
gridsize=30,
cmap='YlOrRd',
mincnt=1,
alpha=0.7,
zorder=3,
edgecolors='none'
)
# Add year and count
ax.set_title(f'{year_num} (n={len(year_data)})',
fontsize=14, fontweight='bold')
# Add gridlines
ax.gridlines(draw_labels=False, linewidth=0.5,
color='gray', alpha=0.5, linestyle='--')
# Add title
fig.suptitle('Greater Sage-Grouse GBIF Observation Density by Year',
fontsize=20, fontweight='bold', y=0.995)
# Add a colorbar
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
cbar = fig.colorbar(hexbin, cax=cbar_ax)
cbar.set_label('Observations per hexagon', fontsize=12, fontweight='bold')
# Adjust layout - plt.tight_layout is skipped here because it is not
# compatible with cartopy GeoAxes; subplots_adjust above handles spacing
# Show the plot
plt.show()
Step 5.4: Log transform raw occurrence data to smooth hotspots¶
# Plot density of GBIF observations per year using hexbins
# Albers projection for the western US
projection = ccrs.AlbersEqualArea(
central_longitude=-110,
central_latitude=40,
standard_parallels=(30, 50)
)
# Convert GBIF data to Albers projection
gbif_albers = gbif_gdf.to_crs(projection)
# Get bounds from BLM data for consistent extent
blm_albers = blm_gdf.to_crs(projection)
xmin, ymin, xmax, ymax = blm_albers.total_bounds
# Add buffer for padding
buffer = 100000 # meters
# Create figure with 3x3 subplots (for years 2016-2024)
fig = plt.figure(figsize=(20, 16))
# Create state boundaries feature
states_provinces = cfeature.NaturalEarthFeature(
category='cultural',
name='admin_1_states_provinces_lines',
scale='50m',
facecolor='none'
)
# Create a subplot for each year
for i, year_num in enumerate(range(2016, 2025)):
# Create subplot with Albers projection
ax = fig.add_subplot(3, 3, i + 1, projection=projection)
# Filter data for current year
year_data = gbif_albers[gbif_albers['year'] == year_num]
# Set extent to keep zoom consistent
ax.set_extent([
xmin - buffer,
xmax + buffer,
ymin - buffer,
ymax + buffer],
crs=projection)
# Add background features
ax.add_feature(cfeature.LAND, facecolor='#f5f5f5', zorder=0)
ax.add_feature(cfeature.BORDERS, linewidth=2, edgecolor='#888888', zorder=2)
ax.add_feature(cfeature.COASTLINE, linewidth=1.5, edgecolor='#888888', zorder=1)
ax.add_feature(states_provinces, linewidth=1.5, edgecolor='#888888', zorder=1)
# Plot BLM habitat areas in light gray as context
blm_albers.plot(ax=ax, facecolor='none', edgecolor='gray',
linewidth=0.5, alpha=0.3, zorder=1)
# Plot observation density using hexbin
if len(year_data) > 0:
x = year_data.geometry.x
y = year_data.geometry.y
hexbin = ax.hexbin(
x, y,
gridsize=30,
cmap='YlOrRd',
mincnt=1,
alpha=0.7,
zorder=3,
edgecolors='none',
norm=mcolors.LogNorm()
)
# Add year and count
ax.set_title(f'{year_num} (n={len(year_data)})',
fontsize=14, fontweight='bold')
# Add gridlines
ax.gridlines(draw_labels=False, linewidth=0.5,
color='gray', alpha=0.5, linestyle='--')
# Add title
fig.suptitle('Greater Sage-Grouse GBIF Observation Density by Year',
fontsize=20, fontweight='bold', y=0.995)
# Add a colorbar
fig.subplots_adjust(right=0.92, hspace=0.25, wspace=0.15)
cbar_ax = fig.add_axes([0.94, 0.15, 0.015, 0.7])
cbar = fig.colorbar(hexbin, cax=cbar_ax)
cbar.set_label('Observations per hexagon', fontsize=12, fontweight='bold')
# Adjust layout
plt.tight_layout(rect=[0, 0, 0.92, 0.99])
# Show the plot
plt.show()
C:\Users\naho5798\AppData\Local\Temp\ipykernel_76000\2667436826.py:92: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect. plt.tight_layout(rect=[0, 0, 0.92, 0.99])
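`LogNorm` makes equal color steps correspond to multiplicative rather than additive steps in counts, which is why the log-scaled hexbins spread the hotspot range more evenly than the linear version in Step 5.3. A sketch, assuming counts between 1 and 100:

```python
import matplotlib.colors as mcolors

norm = mcolors.LogNorm(vmin=1, vmax=100)

# 10 is the geometric midpoint of [1, 100], so it maps to the
# middle of the colormap rather than one tenth of the way up
mid = float(norm(10))
```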